NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Gradient Flows on Graphons: Existence, Convergence, Continuity Equations

https://doi.org/10.1007/s10959-023-01271-8

Oh, Sewoong; Pal, Soumik; Somani, Raghav; Tripathi, Raghavendra (July 2023, Journal of theoretical probability)

Wasserstein gradient flows on probability measures have found a host of applications in various optimization problems. They typically arise as the continuum limit of exchangeable particle systems evolving by some mean-field interaction involving a gradient-type potential. However, in many problems, such as in multi-layer neural networks, the so-called particles are edge weights on large graphs whose nodes are exchangeable. Such large graphs are known to converge to continuum limits called graphons as their size grows to infinity. We show that the Euclidean gradient flow of a suitable function of the edge weights converges to a novel continuum limit given by a curve on the space of graphons that can be appropriately described as a gradient flow or, more technically, a curve of maximal slope. Several natural functions on graphons, such as homomorphism functions and the scalar entropy, are covered by our setup, and the examples have been worked out in detail.
more » « less
Full Text Available
Meta-learning for Mixed Linear Regression

Kong, Weihao; Somani, Raghav; Song, Zhao; Kakade, Sham; Oh, Sewoong (July 2020, International Conference on Machine Learning)
null (Ed.)
n modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labelled data. These include data from medical image processing and robotic interaction. Even though each individual task cannot be meaningfully trained in isolation, one seeks to meta-learn across the tasks from past experiences by exploiting some similarities. We study a fundamental question of interest: When can abundant tasks with small data compensate for lack of tasks with big data? We focus on a canonical scenario where each task is drawn from a mixture of k linear regressions, and identify sufficient conditions for such a graceful exchange to hold; there is little loss in sample complexity even when we only have access to small data tasks. To this end, we introduce a novel spectral approach and show that we can efficiently utilize small data tasks with the help of Omega(k^3/2) medium data tasks each with Omega(k^1/2) examples.
more » « less
Full Text Available
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Kusupati, Aditya; Wallingford, Matthew; Ramanujan, Vivek; Somani, Raghav; Park, Jae Sung; Pillutla, Krishna; Jain, Prateek; Kakade, Sham; Farhadi, Ali (January 2021, Advances in neural information processing systems)

Full Text Available
Robust Meta-learning for Mixed Linear Regression with Small Batches

Kong, Weihao; Somani, Raghav; Kakade, Sham; Oh, Sewoong (January 2020, Advances in neural information processing systems)

A common challenge faced in practical supervised learning, such as medical image processing and robotic interactions, is that there are plenty of tasks but each task cannot afford to collect enough labeled examples to be learned in isolation. However, by exploiting the similarities across those tasks, one can hope to overcome such data scarcity. Under a canonical scenario where each task is drawn from a mixture of k linear regressions, we study a fundamental question: can abundant small-data tasks compensate for the lack of big-data tasks? Existing second moment based approaches show that such a trade-off is efficiently achievable, with the help of medium-sized tasks with k^1/2 examples each. However, this algorithm is brittle in two important scenarios. The predictions can be arbitrarily bad even with only a few outliers in the dataset; or even if the medium-sized tasks are slightly smaller with. We introduce a spectral approach that is simultaneously robust under both scenarios. To this end, we first design a novel outlier-robust principal component analysis algorithm that achieves an optimal accuracy. This is followed by a sum-of-squares algorithm to exploit the information from higher order moments. Together, this approach is robust against outliers and achieves a graceful statistical trade-off.
more » « less
Full Text Available

Search for: All records